Using Runtime Measurements and Historical Traces for Acquiring Knowledge in Parallel Applications
Abstract
A new approach is presented for acquiring knowledge about the resource usage of parallel applications and for searching for similarity in workload traces. The main goal is to improve scheduling decisions in distributed system software, leading to better use of system resources. Resource usage patterns are defined through runtime measurements and a self-organizing neural network architecture, yielding a useful model for classifying parallel applications. A second model, produced by means of an instance-based algorithm, searches for similarity in workload traces in order to predict an attribute of a newly submitted parallel application, such as run time or memory usage. Both models allow effortless knowledge updating when new information arrives. The paper describes these models and the results obtained by applying them to knowledge acquisition in both synthetic and real application traces.
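As a rough illustration of the instance-based idea described in the abstract, the sketch below predicts a numeric attribute (here, run time) of a new application from the most similar records in a workload trace. It is a generic distance-weighted k-nearest-neighbour estimate, not the paper's actual algorithm; the function name `predict_attribute`, the feature choice (requested CPUs and memory), and the sample records are all hypothetical, and feature normalization is deliberately left out.

```python
import math


def predict_attribute(history, query, k=3):
    """Distance-weighted k-nearest-neighbour estimate of a numeric attribute.

    history: list of (features, value) pairs taken from past executions
    query:   feature tuple of the newly submitted application
    (Hypothetical helper; the paper's own algorithm may differ.)
    """
    if not history:
        raise ValueError("history must contain at least one trace record")
    # Rank past executions by Euclidean distance to the query.
    ranked = sorted((math.dist(features, query), value) for features, value in history)
    nearest = ranked[:k]
    # If any neighbour matches exactly, return the mean of the exact matches.
    exact = [value for dist, value in nearest if dist == 0.0]
    if exact:
        return sum(exact) / len(exact)
    # Otherwise weight each neighbour by the inverse of its distance.
    weights = [1.0 / dist for dist, _ in nearest]
    return sum(w * v for w, (_, v) in zip(weights, nearest)) / sum(weights)


# Hypothetical trace records: (requested CPUs, requested memory in GB) -> run time in seconds.
history = [
    ((4, 2.0), 100.0),
    ((4, 2.0), 120.0),
    ((8, 4.0), 300.0),
    ((16, 8.0), 900.0),
]

estimate = predict_attribute(history, (4, 2.0), k=2)
```

Because the model is just the stored records, updating it when a new execution finishes amounts to appending one more `(features, value)` pair to `history`, which is one way to read the abstract's claim of effortless knowledge updating.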
Similar resources
HAEC-SIM: a simulation framework for highly adaptive energy-efficient computing platforms
This work presents a new trace-based parallel discrete event simulation framework designed for predicting the behavior of a novel computing platform running energy-aware parallel applications. Discrete event traces capture the runtime behavior of parallel applications on existing systems and form the basis for the simulation. The simulation framework processes the events of the input trace by a...
ScoPred-Scalable User-Directed Performance Prediction Using Complexity Modeling and Historical Data
Using historical information to predict future runs of parallel jobs has been shown to be valuable in job scheduling. Trends toward more flexible job-scheduling techniques such as adaptive resource allocation, and toward the expansion of scheduling to grids, make runtime predictions even more important. We present a technique of employing both a user’s knowledge of his/her parallel application and hi...
A Trace-Scaling Agent for Parallel Application Tracing
requirement for trace files. We show that the agent can obtain such an understanding automatically at runtime without programmer intervention or support. The remainder of the paper is structured as follows: In section 2 we describe scalability problems of tracing mechanisms. Section 3 shows the implementation of the trace-scaling agent. Section 4 describes some applications and results of scale...
Automatic search for patterns of inefficient behavior in parallel applications
Event tracing is a powerful method of analyzing the performance behavior of parallel applications. Because event traces record the temporal and spatial relationships between individual runtime events, they allow application developers to analyze dependences of performance phenomena across concurrent control flows. However, in view of the large amounts of data generated on contemporary parallel ...
Integrated Runtime Measurement Summarisation and Selective Event Tracing for Scalable Parallel Execution Performance Diagnosis
Straightforward trace collection and processing becomes increasingly challenging and ultimately impractical for more complex, long-running, highly parallel applications. Accordingly, the KOJAK measurement system for MPI, OpenMP and SHMEM parallel applications is incorporating runtime management and summarisation capabilities. This offers a more scalable and effective profile of parallel executio...